Abstract

A  target  is  hidden  in  one  of  several  possible  locations,  and  the  objective  is  to  find 
the  target  as  fast  as  possible.  One  common  measure  of  effectiveness  for  the  search 
process  is  the  expected  time  of  the  search.  This  type  of  search  optimization  problem 
has  been  addressed  and  solved  in  the  literature  for  the  case  where  the  searcher  has 
imperfect  sensitivity  (possible  false  negative  results),  but  perfect  specificity  (no  false 
positive  detections).  In  this  paper,  which  is  motivated  by  recent  military  and  homeland 
security  search  situations,  we  extend  the  results  to  the  case  where  the  search  is  subject 
to  false  positive  detections. 
1  Introduction

Discrete  search  problems  have  been  out  of  vogue  for  over  two  decades.  However,  recent 
defense problems, such as searching for a hostage hidden in a city (e.g., relatively recent
events in the Gaza Strip) or detecting improvised explosive devices (IEDs) in Iraq, have
underscored  the  need  for  efficient  and  effective  search  methods  for  detecting  targets  of  various 
types. 

We  consider  a  surveillance  system,  the  purpose  of  which  is  to  find  a  target  that  is  hidden 
in  one  out  of  n  possible  locations.  The  target  location  is  uncertain  but  there  is  some  prior 
information  that  is  quantified  in  a  prior  probability  distribution.  The  surveillance  system 
comprises  a  sensor  and  a  verification  team.  The  sensor,  which  searches  sequentially  the  n 
locations,  is  imperfect  and  therefore  its  cues  are  subject  to  errors.  The  verification  team, 
which  makes  no  errors,  investigates  positive  detections  by  the  imperfect  sensor  and  verifies  if 
they  are  true  or  false.  Such  a  search  process  takes  time,  and  the  objective  is  to  find  a  search 
policy  that  minimizes  the  expected  search  time  until  the  target  is  found  or  optimizes  some 
other  measures  of  effectiveness  (MOEs)  such  as  the  probability  of  detection. 

Discrete search problems of the type mentioned above are not new. Optimal whereabouts
search is studied in [1] and [6]. Chew [3] considers an optimal search with a stopping rule where
all search outcomes are independent, conditional on the location of the searched object and
the  search  policy.  In  another  paper  Chew  [4]  considers  a  discrete  search,  where  the  target 
may  not  be  present  in  one  of  the  searched  cells  and  the  problem  is  when  to  stop  the  search. 
Similar  problems  are  discussed  in  [8].  Wegener  [12]  investigates  a  search  process  where 
the  search  time  of  a  cell  depends  on  the  number  of  searches  so  far.  A  minimum  cost  search 
problem, similar to the one presented above, is discussed in [9], where only one search mode is
considered  and  the  sensor  has  perfect  specificity.  Other  discrete  search  problems  are  studied 
in  [2,  7,  13,  10].  Stone  [11]  gives  a  comprehensive  and  detailed  analysis  of  both  maximum 
probability  and  minimum  cost  search  models.  All  of  the  aforementioned  references  assume 
that  the  sensor  has  perfect  specificity,  that  is,  if  it  records  a  detection,  the  target  is  found. 
Our  model  relaxes  this  assumption  and  extends  classical  discrete  search  theory  to  incorporate 
false-positive errors, which are realistic phenomena in many defense and homeland security
situations.  While  some  authors  (e.g.,  Danskin  [5])  have  considered  the  effect  of  false  positive 
detections  in  the  presence  of  decoys,  to  the  best  of  our  knowledge,  our  model  is  the  first  direct 
generalization  of  the  classical  discrete  search  problem.  Incorporating  imperfect  specificity 
necessitates  the  introduction  of  an  investigation  stage  following  a  detection. 

The  specific  contributions  of  this  paper  are: 

•  We  show  that  a  greedy  policy  is  optimal  when  each  positive  detection  by  the  search 
sensor  is  followed  by  an  investigation  by  the  verification  team. 

•  We  derive  the  expected  time  of  the  search  under  the  above  conditions. 

•  If  the  verification  time  is  significantly  longer  than  the  search  time,  an  alternative  MOE 
is  the  probability  that  the  first  detection  is  a  true  one.  We  show  that  a  greedy  policy 
is  optimal  for  this  MOE,  too. 

•  Under  certain  realistic  search  conditions,  we  show  that  the  greedy  search  rule  is  also 
uniformly  optimal. 

The  paper  is  organized  as  follows.  In  Section  2,  we  introduce  notation  and  formulate  the 
problem.  In  Section  3,  we  prove  the  optimality  of  a  certain  greedy  rule  for  the  minimum-cost 
search  problem.  In  Section  4,  we  examine  two  special  cases  of  the  model  and  show  that  a 
greedy  rule  is  optimal  also  for  probability  oriented  objectives.  Section  5  gives  a  summary  of 
the  results  and  briefly  discusses  future  research. 


2  Operational  Motivation,  Statement  of  the  Problem 
and  Notation 

The  operational  setting  of  our  model  can  be  demonstrated  by  the  following  scenario.  A 
hostage  is  hidden  in  a  place  (e.g.,  house)  located  in  one  of  n  possible  area  cells  (AC)  (e.g., 
city  blocks).  See  Figure  1.  The  objective  is  to  locate  the  hostage  as  quickly  as  possible.  An 
imperfect  sensor  searches  the  ACs  one  at  a  time.  Following  a  detection,  which  identifies  a 
place  in  the  AC  (e.g.,  an  address  of  a  house)  where  the  hostage  may  be,  a  ground  verification 
and rescue team is sent to that place to verify the detection and, if positive, rescue the
hostage.  There  are  three  possible  types  of  detection. 

1. Perfect Detection: The sensor identifies correctly the place (address) where the hostage
is kept.

2. Partial Detection: The sensor correctly identifies the AC where the hostage is held,
but incorrectly identifies the specific place of captivity.

3. False Detection: The hostage is not hidden in the AC where the sensor has recorded a
detection.


Figure 1: A city is divided into 15 ACs, and a search in AC 11 yields a detection at a specific
address.

If  the  ground  team  is  sent  to  a  wrong  address  in  a  certain  AC,  it  continues  searching  the 
rest  of  the  AC  and  if  the  hostage  is  hidden  there  (the  case  of  partial  detection)  he  will  be 
found  by  the  team.  If  the  hostage  is  not  hidden  in  that  AC  (the  case  of  false  detection),  the 
AC  is  declared  to  be  cleared  and  therefore  removed  from  further  search  by  the  sensor. 

Let $\theta$ be the parameter that describes the AC where the target is hidden; that is, $\theta = i$
when AC $i$ contains the target. Given the prior probability mass function (p.m.f.) of $\theta$,
$\pi : \{1, \ldots, n\} \to [0, 1]$, we write $\pi_i = P(\theta = i)$.

Recall that the sensor is imperfect. Let $p_i = P(\text{sensor indicates detection in AC } i \mid \theta = i)$;
that is, $p_i$ is the probability that the sensor identifies correctly the AC where the hostage is
hidden. Let $r_i = P(\text{sensor indicates detection and identifies correctly the place in AC } i \mid \theta = i)$.
Clearly, $r_i \le p_i$, and $p_i - r_i$ is the probability that the sensor identifies the wrong place in
the AC where the hostage is hidden. Finally, let $q_i = P(\text{sensor indicates detection in AC } i \mid \theta \ne i)$;
$1 - q_i$ is the specificity of the sensor in AC $i$. We assume throughout that $p_i \ge q_i$ without
loss of generality, because we can reverse the cue of the sensor if $q_i > p_i$.

Given a prior p.m.f. $\pi$, we select an action $a(\pi) \in \{1, \ldots, n\}$ that indicates the AC to be
searched next. An action $a(\pi) \ne \theta$ results in one of two possible outcomes: a no detection
or a false detection. Following either of these outcomes, posterior probabilities are obtained
and the prior p.m.f. of $\theta$ is updated. In the case of a no detection, the posterior p.m.f.,
$\Pi_a^-(\pi) = (\Pi_{a,1}^-(\pi), \ldots, \Pi_{a,n}^-(\pi))$, is given by

$$
\Pi_{a,j}^-(\pi) =
\begin{cases}
\dfrac{(1 - p_a)\pi_a}{1 - p_a\pi_a - q_a(1 - \pi_a)}, & \text{if } j = a; \\[2ex]
\dfrac{(1 - q_a)\pi_j}{1 - p_a\pi_a - q_a(1 - \pi_a)}, & \text{if } j \ne a.
\end{cases}
\tag{1}
$$


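Considering the case of a false detection, the posterior p.m.f., denoted by
$\Pi_a^+(\pi) = (\Pi_{a,1}^+(\pi), \ldots, \Pi_{a,n}^+(\pi))$, is

$$
\Pi_{a,j}^+(\pi) =
\begin{cases}
0, & \text{if } j = a; \\[1ex]
\dfrac{\pi_j}{1 - \pi_a}, & \text{if } j \ne a.
\end{cases}
\tag{2}
$$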


In either case, no detection or false detection, we update the prior $\pi$ by $\Pi_a^-(\pi)$ or $\Pi_a^+(\pi)$,
respectively. This way a sequence of priors is obtained until a true detection occurs.
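The two updates in Equations (1) and (2) can be sketched in a few lines of Python. This is only an illustrative sketch: the function names, the 0-based AC indexing, and the numerical values are assumptions made for the example and are not part of the model.

```python
from typing import List


def posterior_no_detection(pi: List[float], p: List[float], q: List[float], a: int) -> List[float]:
    """Equation (1): posterior after a look in AC `a` yields no detection."""
    # Probability that the sensor reports nothing in AC a.
    denom = 1.0 - p[a] * pi[a] - q[a] * (1.0 - pi[a])
    if denom <= 0.0:
        raise ValueError("a no-detection outcome has zero probability")
    return [
        (1.0 - p[a]) * pi[j] / denom if j == a else (1.0 - q[a]) * pi[j] / denom
        for j in range(len(pi))
    ]


def posterior_false_detection(pi: List[float], a: int) -> List[float]:
    """Equation (2): posterior after AC `a` is cleared following a false detection."""
    if pi[a] >= 1.0:
        raise ValueError("a false detection is impossible if the target is surely in AC a")
    return [0.0 if j == a else pi[j] / (1.0 - pi[a]) for j in range(len(pi))]


# Hypothetical three-AC example (values chosen only for illustration).
pi = [0.5, 0.3, 0.2]
p = [0.9, 0.8, 0.7]
q = [0.2, 0.1, 0.3]
after_miss = posterior_no_detection(pi, p, q, a=0)
after_false = posterior_false_detection(pi, a=0)
```

In both cases the posterior sums to one, and the probabilities of the ACs other than the searched one keep their relative proportions.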

The time it takes the sensor to search AC $i$ is $c_i$. In case of perfect detection, the rescue
team completes the rescue mission in $C_i^{(1)}$ time units. In case of partial detection, the length
of the rescue mission is $C_i^{(2)} \ge C_i^{(1)}$. In case of false detection, the comprehensive verification
time by the ground team in AC $i$ before it is declared to be clear is $C_i^{(3)}$. The objective
of the searcher is to minimize the expected total time it takes to rescue the hostage.

To  formulate  the  optimal  search  policy,  first  note  that  the  total  time  it  takes  to  rescue 
the  hostage  can  be  broken  into  two  parts: 


1. Search Time: The time spent to locate the AC in which the hostage is hidden,
including the comprehensive verification time spent in a wrong AC following a false
detection.


2.  Rescue  Time:  The  time  it  takes  to  find  and  rescue  the  hostage  after  locating  the  AC 
in  which  the  hostage  is  hidden,  either  by  perfect  or  partial  detection. 

Recall that $\theta$ denotes the AC that contains the hostage. Conditional on $\theta = i$, the rescue
time takes on values $C_i^{(1)}$ or $C_i^{(2)}$ depending on whether the detection in AC $i$ is perfect or
partial. Therefore, the conditional expected rescue time is equal to

$$
\frac{r_i}{p_i} C_i^{(1)} + \frac{p_i - r_i}{p_i} C_i^{(2)}.
$$


Because at the beginning of the search, there is probability $\pi_i$ that the hostage is hidden in
AC $i$, the expected total rescue time is

$$
\sum_{i=1}^{n} \pi_i \left( \frac{r_i}{p_i} C_i^{(1)} + \frac{p_i - r_i}{p_i} C_i^{(2)} \right),
\tag{3}
$$


which  is  a  constant,  invariant  to  the  search  policy.  It  follows  that  we  can  formulate  an 
equivalent  objective  function,  which  is  to  minimize  the  expected  search  time  until  correctly 
detecting the AC in which the hostage is hidden, either by a perfect or partial detection.
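As a small numerical illustration of Equation (3), the following sketch evaluates the expected total rescue time. The function name, the 0-based indexing, and the input values are hypothetical and chosen only for the example.

```python
def expected_rescue_time(pi, p, r, c1, c2):
    """Equation (3): sum over ACs of pi_i * (r_i/p_i * C1_i + (p_i - r_i)/p_i * C2_i)."""
    total = 0.0
    for pi_i, p_i, r_i, c1_i, c2_i in zip(pi, p, r, c1, c2):
        # Share of correct-AC detections that also pinpoint the right place.
        perfect_share = r_i / p_i
        total += pi_i * (perfect_share * c1_i + (1.0 - perfect_share) * c2_i)
    return total


# Hypothetical two-AC example.
pi = [0.5, 0.5]   # prior
p = [0.8, 0.5]    # P(detection in AC i | target in AC i)
r = [0.4, 0.5]    # P(detection at the right place | target in AC i)
c1 = [1.0, 2.0]   # rescue time after a perfect detection
c2 = [3.0, 4.0]   # rescue time after a partial detection
rescue = expected_rescue_time(pi, p, r, c1, c2)
```

The function takes no search order as input, which mirrors the observation that this term is invariant to the search policy.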
Note that in this equivalent objective function, the parameters $C_i^{(1)}$ and $C_i^{(2)}$, which
only concern the rescue operation but do not affect the search time, do not appear. In other
words, the optimal policy is invariant with respect to these parameters. Throughout the rest
of the paper, we use the equivalent modified objective function, and let $C_i = C_i^{(3)}$ to simplify
the notation. In addition, hereafter we use the terms search and look interchangeably, and
the hostage is called the target.


3  The  Optimal  Search  Policy 

A search policy $\sigma$ is a sequence of actions adapted to the sequence of priors; in our case each
action depends only on the latest prior p.m.f. $\pi$. Let $T_\sigma(\pi)$ be the expected search time
until detecting the correct AC, if the prior p.m.f. is $\pi$ and the searcher follows search policy
$\sigma$. Given two policies $\sigma_1$ and $\sigma_2$, we write $\sigma_1 \succeq_\pi \sigma_2$ when $T_{\sigma_1}(\pi) \le T_{\sigma_2}(\pi)$.

The  main  result  of  this  paper  is  presented  in  the  next  theorem. 


Theorem 3.1 Given a prior p.m.f. $\pi$ for $\theta$, the optimal search policy follows a greedy rule
where the AC to search next is one having the maximal value of

$$
\frac{p_i \pi_i}{c_i + q_i C_i}, \qquad i = 1, 2, \ldots, n.
$$


We call the search policy in Theorem 3.1 the greedy rule, because each time we search in
the AC that has the maximal ratio between the probability of finding the hostage and the
expected cost of the look when the hostage is elsewhere, $c_i + q_i C_i$, which includes the
(wasted) verification cost due to a false detection. Theorem 3.1 generalizes the case of perfect
specificity, where the $q_i$'s all equal zero; see, for example, [9] and [11].
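The index in Theorem 3.1 is straightforward to evaluate. The sketch below is illustrative only: the function names, the 0-based AC indexing, and the numbers are assumptions for the example.

```python
def greedy_index(pi, p, q, c, C):
    """Index of Theorem 3.1 for every AC: p_i * pi_i / (c_i + q_i * C_i)."""
    return [p[i] * pi[i] / (c[i] + q[i] * C[i]) for i in range(len(pi))]


def greedy_choice(pi, p, q, c, C):
    """AC with the largest index (ties broken by the lowest position)."""
    index = greedy_index(pi, p, q, c, C)
    return max(range(len(pi)), key=lambda i: index[i])


# Hypothetical three-AC example.
pi = [0.5, 0.3, 0.2]
p = [0.9, 0.8, 0.7]
q = [0.2, 0.1, 0.3]
c = [1.0, 1.0, 1.0]
C = [10.0, 2.0, 1.0]

choice = greedy_choice(pi, p, q, c, C)
# With perfect specificity (all q_i = 0) the index reduces to p_i * pi_i / c_i.
choice_perfect = greedy_choice(pi, p, [0.0, 0.0, 0.0], c, C)
```

In this example the first AC has the largest $p_i \pi_i$, but its long verification time and non-negligible false-positive rate push it behind the second AC; with all $q_i$ set to zero it is searched first.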

To facilitate the proof of Theorem 3.1, we introduce two alternatives to express a feasible
policy. First, let

$$
\begin{pmatrix}
a_1, & a_2, & a_3, & a_4, & \ldots \\
G, & G, & G, & G, & \ldots
\end{pmatrix}
$$

denote a feasible policy such that the searcher first follows the order $a_1, a_2, \ldots$, until the first
detection takes place. If the first detection correctly locates the target (either perfect detection
or partial detection), then the problem ends. If the first detection is a false detection,
then the searcher switches to the greedy rule thereafter. Second, let

$$
\begin{pmatrix}
a_1, & a_2, & a_3, & a_4, & \ldots \\
b, & G, & G, & G, & \ldots
\end{pmatrix}
$$

denote a policy similar to the previous one, with the exception that if the first search in AC
$a_1$ results in a false detection, then the searcher is required to search in AC $b$ immediately
before switching to the greedy rule.

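Lemma 3.1 Consider two policies

$$
\delta_1 =
\begin{pmatrix}
i, & j, & a_3, & a_4, & \ldots \\
j, & G, & G, & G, & \ldots
\end{pmatrix}
\quad \text{and} \quad
\delta_2 =
\begin{pmatrix}
j, & i, & a_3, & a_4, & \ldots \\
i, & G, & G, & G, & \ldots
\end{pmatrix}.
$$

For any $\pi$, $\delta_1 \succeq_\pi \delta_2$, that is, the expected search time with policy $\delta_1$ is no longer than with
policy $\delta_2$, if and only if

$$
\frac{p_i \pi_i}{c_i + q_i C_i} \ge \frac{p_j \pi_j}{c_j + q_j C_j}.
$$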


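Proof: Let

$$
\delta =
\begin{pmatrix}
a_3, & a_4, & \ldots \\
G, & G, & \ldots
\end{pmatrix}.
$$

By conditioning on the location of the target, we can write

$$
\begin{aligned}
T_{\delta_1}(\pi) = c_i
&+ \pi_i (1 - p_i) \Big[ c_j + q_j \big( C_j + T_G(\Pi_j^+ \Pi_i^-(\pi)) \big) + (1 - q_j) T_\delta(\Pi_j^- \Pi_i^-(\pi)) \Big] \\
&+ \pi_j \Big[ q_i \big( C_i + c_j + (1 - p_j) T_G(\Pi_j^- \Pi_i^+(\pi)) \big) + (1 - q_i) \big( c_j + (1 - p_j) T_\delta(\Pi_j^- \Pi_i^-(\pi)) \big) \Big] \\
&+ (1 - \pi_i - \pi_j) \Big[ q_i \big( C_i + c_j + q_j ( C_j + T_G(\Pi_j^+ \Pi_i^+(\pi)) ) + (1 - q_j) T_G(\Pi_j^- \Pi_i^+(\pi)) \big) \\
&\qquad\qquad + (1 - q_i) \big( c_j + q_j ( C_j + T_G(\Pi_j^+ \Pi_i^-(\pi)) ) + (1 - q_j) T_\delta(\Pi_j^- \Pi_i^-(\pi)) \big) \Big],
\end{aligned}
$$

where $T_G(\cdot)$ denotes the expected search time with the greedy rule. Interchanging the
indices $i$ and $j$ we get an expression for $T_{\delta_2}(\pi)$. Because $\Pi_j^- \Pi_i^-(\pi) = \Pi_i^- \Pi_j^-(\pi)$,
$\Pi_j^+ \Pi_i^-(\pi) = \Pi_i^- \Pi_j^+(\pi)$, and $\Pi_j^- \Pi_i^+(\pi) = \Pi_i^+ \Pi_j^-(\pi)$, taking the difference we have that

$$
T_{\delta_1}(\pi) - T_{\delta_2}(\pi) = -\pi_i p_i (c_j + q_j C_j) + \pi_j p_j (c_i + q_i C_i).
$$

From the last equation the result follows immediately. □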
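The cancellation behind Lemma 3.1 can be checked numerically. The sketch below assumes, as in the proof, that the continuation terms ($T_G$ and $T_\delta$) are identical under the two orderings, so only the expected time spent on the first looks in ACs $i$ and $j$ (including any verification) needs to be compared. The function names, the 0-based indexing, and the numbers are hypothetical.

```python
def head_cost(pi, p, q, c, C, i, j):
    """Expected time spent on ACs i and then j, excluding all continuation terms."""
    rest = 1.0 - pi[i] - pi[j]  # probability the target is in neither AC
    return (
        c[i]                                              # first look is always made
        + pi[i] * (1.0 - p[i]) * (c[j] + q[j] * C[j])     # target in i, missed
        + pi[j] * (q[i] * C[i] + c[j])                    # target in j
        + rest * (q[i] * C[i] + c[j] + q[j] * C[j])       # target elsewhere
    )


def lemma_difference(pi, p, q, c, C, i, j):
    """Closed form for T_{delta_1} - T_{delta_2} stated in the proof."""
    return -pi[i] * p[i] * (c[j] + q[j] * C[j]) + pi[j] * p[j] * (c[i] + q[i] * C[i])


# Hypothetical three-AC example.
pi = [0.5, 0.3, 0.2]
p = [0.9, 0.8, 0.7]
q = [0.2, 0.1, 0.3]
c = [1.0, 1.0, 1.0]
C = [10.0, 2.0, 1.0]

direct = head_cost(pi, p, q, c, C, 0, 1) - head_cost(pi, p, q, c, C, 1, 0)
closed_form = lemma_difference(pi, p, q, c, C, 0, 1)
```

Here the difference is positive, so searching AC 1 before AC 0 is the better ordering, in agreement with the index comparison of the lemma.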

We  next  present  the  proof  of  Theorem  3.1. 

Proof  of  Theorem  3.1 

The  proof  is  based  on  induction  on  the  number  of  ACs.  The  theorem  is  trivially  true  for 
n  =  1. 

Suppose that the greedy rule is optimal if there are $n - 1$ or fewer ACs. Next we show
that it is also optimal when there are $n$ ACs. Without loss of generality let

$$
\frac{p_1 \pi_1}{c_1 + q_1 C_1} = \max_{i = 1, \ldots, n} \frac{p_i \pi_i}{c_i + q_i C_i}.
\tag{4}
$$

We consider a class of search policies in which AC 1 is searched only following $r - 1$
no-detection searches elsewhere; that is, $a_i \ne 1$ for $i = 1, \ldots, r - 1$ and $a_r = 1$. Let $A_r$ denote
the set of these policies. We first deal with the case $r < \infty$.

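Let

$$
\zeta_1 =
\begin{pmatrix}
1, & a_1, & a_2, & \ldots \\
G, & G, & G, & \ldots
\end{pmatrix},
\quad
\zeta_2 =
\begin{pmatrix}
a_1, & 1, & a_2, & \ldots \\
G, & G, & G, & \ldots
\end{pmatrix},
\quad \ldots, \quad
\zeta_r =
\begin{pmatrix}
a_1, & \ldots, & a_{r-1}, & 1, & a_{r+1}, & \ldots \\
G, & \ldots, & G, & G, & G, & \ldots
\end{pmatrix}.
$$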

From Equation (4), the induction hypothesis, and Lemma 3.1 we have

$$
\zeta_1 \succeq_\pi
\begin{pmatrix}
1, & a_1, & a_2, & \ldots \\
a_1, & G, & G, & \ldots
\end{pmatrix}
\succeq_\pi \zeta_2.
$$

Hence, $\zeta_1 \succeq_\pi \zeta_2$. Repeating this argument we can see that $\zeta_1 \succeq_\pi \zeta_2 \succeq_\pi \cdots \succeq_\pi \zeta_r$. In
particular, this implies that $\zeta_1 \succeq_\pi \zeta_r$. In other words, we show that for any policy in $A_r$,
with $r < \infty$, we can find a better policy that starts with AC 1.

Our previous argument shows that $(T_{\zeta_r}(\pi))_{r \ge 1}$ is a nondecreasing real sequence, so that
$T_{\zeta_1}(\pi) \le \lim_{r \to \infty} T_{\zeta_r}(\pi)$. Hence, for any policy in $A_\infty$ the expected search time does not increase by starting the
search on AC 1. Consequently, it is optimal to first search in AC 1. □


To carry out the optimal policy in practice, we first use the following algorithm to generate
the search order if all the searches thus far resulted in no detection.

1. Set $m = 1$.

2. Choose $a$ such that

$$
\frac{p_a \pi_a}{c_a + q_a C_a} = \max_j \frac{p_j \pi_j}{c_j + q_j C_j},
$$

and let $e_m = a$.

3. Update $\pi$ as follows:

$$
\pi_a \leftarrow \frac{(1 - p_a)\pi_a}{1 - p_a \pi_a - q_a(1 - \pi_a)},
\qquad
\pi_j \leftarrow \frac{(1 - q_a)\pi_j}{1 - p_a \pi_a - q_a(1 - \pi_a)}, \quad j \ne a.
$$

4. Let $m \leftarrow m + 1$, and go to 2.
Let $e = (e_1, e_2, \ldots)$ denote the search order generated by this algorithm. If following
the optimal policy the first $m$ searches all resulted in no detection, then it is optimal to
next search AC $e_{m+1}$. Now suppose that the first $m - 1$ searches resulted in no detection,
and the $m$th search in AC $a$ ($e_m = a$) results in a false detection. To see how we can
use $e$ to find which AC to search next, note that by Equations (1) and (2) the ratio of the
posterior probabilities of any two ACs $j \ne a$ and $k \ne a$ remains $\pi_j / \pi_k$, regardless of the search outcome
in AC $a$. Therefore, according to Theorem 3.1, whether the $m$th search in AC $a$ results in
no detection or in a false detection, the relative positions for all the ACs other than $a$ in $e$
remain unchanged. Consequently, with the optimal policy, we simply continue to follow $e$
by skipping those ACs that have gone through a comprehensive verification due to a false
detection.
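The order-generating algorithm and the skipping rule can be sketched as follows. This is a minimal illustration, not a definitive implementation: the function names, the truncation of the (infinite) order to a fixed number of steps, the 0-based AC indexing, and the example values are all assumptions.

```python
def search_order(pi, p, q, c, C, steps):
    """First `steps` entries of the order e, assuming every look yields no detection."""
    pi = list(pi)  # work on a copy; the caller's prior is left untouched
    order = []
    for _ in range(steps):
        # Step 2: AC with the largest index p_j * pi_j / (c_j + q_j * C_j).
        a = max(range(len(pi)), key=lambda j: p[j] * pi[j] / (c[j] + q[j] * C[j]))
        order.append(a)
        # Step 3: no-detection update, Equation (1), using the old pi throughout.
        denom = 1.0 - p[a] * pi[a] - q[a] * (1.0 - pi[a])
        pi = [
            ((1.0 - p[a]) if j == a else (1.0 - q[a])) * pi[j] / denom
            for j in range(len(pi))
        ]
    return order


def remaining_order(order, position, cleared):
    """Entries of e after `position`, skipping ACs cleared by a false detection."""
    return [a for a in order[position:] if a not in cleared]


# Hypothetical three-AC example.
pi = [0.5, 0.3, 0.2]
p = [0.9, 0.8, 0.7]
q = [0.2, 0.1, 0.3]
c = [1.0, 1.0, 1.0]
C = [10.0, 2.0, 1.0]

e = search_order(pi, p, q, c, C, steps=3)
# Suppose the very first look (in AC e[0]) produced a false detection.
after_false_detection = remaining_order(e, position=1, cleared={e[0]})
```

Because the order is computed once up front, a false detection requires no recomputation: the searcher only filters the cleared AC out of the remaining entries.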
4  Special Cases

In  this  section,  we  examine  two  special  cases  of  the  search  model,  which  represent  two  extreme 
cases  of  the  search  scenario:  one  where  the  investigation  process  is  risky  and  extremely  long 
compared  to  the  search  time,  and  the  other  when  the  effect  of  the  investigation  time  is 
negligible. 

4.1  Risky  and  Very  Long  Investigation  Process 

Suppose  that  sending  out  ground  units  to  investigate  a  detection  is  a  risky  and  complex 
operation that may take a very long time compared to the search time, that is, $C_i \gg c_i$,
$i = 1, \ldots, n$. In this case, the objective would be to minimize the chance of false positive
detections,  and  consequently,  the  MOE  would  be  the  probability  that  the  first  detection  is  a 
true  one.  The  optimal  policy  in  this  case  is  greedy,  too,  as  shown  in  the  following  theorem. 

Theorem 4.1 Given a prior p.m.f. $\pi$ for $\theta$, the optimal search policy that maximizes the
probability that the first detection is a true one follows a greedy rule where the AC to search
next is one having the maximal value of

$$
\frac{p_i \pi_i}{q_i}, \qquad i = 1, \ldots, n.
$$

Proof: Because the objective is to maximize the probability that the first detection is a true
detection, a feasible policy is a sequence of ACs, such that the searcher follows this sequence
until a detection occurs. Consider two policies $\delta_1 = (i, j, a_3, a_4, \ldots)$ and $\delta_2 = (j, i, a_3, a_4, \ldots)$,
and let $\delta = (a_3, a_4, \ldots)$.

Let $V_\delta(\pi)$ denote the probability that the first detection is a true detection, if the prior
p.m.f. is $\pi$ and the searcher follows search policy $\delta$. By conditioning on the location of the
target, we can write

$$
\begin{aligned}
V_{\delta_1}(\pi) ={}& \pi_i \big( p_i + (1 - p_i)(1 - q_j) V_\delta(\Pi_j^- \Pi_i^-(\pi)) \big) \\
&+ \pi_j \big( (1 - q_i) p_j + (1 - q_i)(1 - p_j) V_\delta(\Pi_j^- \Pi_i^-(\pi)) \big) \\
&+ (1 - \pi_i - \pi_j)(1 - q_i)(1 - q_j) V_\delta(\Pi_j^- \Pi_i^-(\pi)).
\end{aligned}
$$

Interchanging the indices $i$ and $j$ we get an expression for $V_{\delta_2}(\pi)$. Because $\Pi_j^- \Pi_i^-(\pi) =
\Pi_i^- \Pi_j^-(\pi)$, taking the difference we have that

$$
V_{\delta_1}(\pi) - V_{\delta_2}(\pi) = \pi_i p_i q_j - \pi_j p_j q_i.
$$

Therefore, $V_{\delta_1}(\pi) \ge V_{\delta_2}(\pi)$ if and only if $p_i \pi_i / q_i \ge p_j \pi_j / q_j$. The rest of the proof follows the
steps as in Theorem 3.1 because we can always find a better policy than $\delta$, if $\delta$ does not start
with the AC that has the maximal value of $p_i \pi_i / q_i$, $i = 1, \ldots, n$. □

Note that the greedy rule of Theorem 4.1 is also a special case of Theorem 3.1 when
$C_i = C$, $i = 1, \ldots, n$ and $C \to \infty$. In other words, if $C_i = C \gg c_i$, $i = 1, \ldots, n$, then the greedy
rule of Theorem 4.1 also minimizes the expected time to detection.
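The relation between the two indices can be illustrated numerically: with a common verification time $C$ that is very large relative to the search times, ranking the ACs by the index of Theorem 3.1 gives the same order as ranking them by $p_i \pi_i / q_i$. The sketch below assumes all $q_i > 0$; the function names, the 0-based indexing, and the numbers are hypothetical.

```python
def rank_by(index):
    """AC labels sorted from the largest index to the smallest."""
    return sorted(range(len(index)), key=lambda i: -index[i])


def first_detection_index(pi, p, q):
    """Index of Theorem 4.1: p_i * pi_i / q_i (requires q_i > 0)."""
    return [p[i] * pi[i] / q[i] for i in range(len(pi))]


def time_index(pi, p, q, c, C):
    """Index of Theorem 3.1: p_i * pi_i / (c_i + q_i * C_i)."""
    return [p[i] * pi[i] / (c[i] + q[i] * C[i]) for i in range(len(pi))]


# Hypothetical three-AC example.
pi = [0.5, 0.3, 0.2]
p = [0.9, 0.8, 0.7]
q = [0.2, 0.1, 0.3]
c = [1.0, 2.0, 3.0]

rank_41 = rank_by(first_detection_index(pi, p, q))
rank_31_large_C = rank_by(time_index(pi, p, q, c, [1e6] * 3))  # C much larger than c_i
rank_31_small_C = rank_by(time_index(pi, p, q, c, [1.0] * 3))  # C comparable to c_i
```

When $C$ is comparable to the search times, the search costs $c_i$ still matter and the two rankings can differ, as they do in this example.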
4.2  Uniform  Search  Time,  No  Effect  of  Investigation  Time

Suppose  that  while  the  investigative  sensor  performs  its  investigation,  the  search  sensor  can 
proceed  in  searching  other  ACs.  However,  we  assume  that  due  to  operational  constraints 
and  the  high  risk  associated  with  the  investigation  operation,  a  verification  team  can  be 
sent  out  to  investigate  an  AC  only  upon  a  detection  cue  by  the  sensor.  Given  an  infinite 
capacity  of  investigating  resources,  the  problem  is  to  find  the  search  policy  that  minimizes 
the  number  of  searches  until  the  correct  AC  is  detected  by  the  search  sensor  (which  will  be 
immediately  followed  by  an  investigation  that  will  confirm  the  detection).  This  scenario  can 
be represented by our model by letting $c_i = 1$ and $C_i = 0$ for $i = 1, \ldots, n$.

In  this  situation,  the  greedy  search  policy  given  in  Section  3  not  only  minimizes  the 
expected  number  of  looks  until  the  target  is  detected,  but  it  is  also  uniformly  optimal.  A 
discrete  search  policy  is  said  to  be  uniformly  optimal  (see  e.g.,  [11],  p.  104)  if  it  maximizes 
the  probability  of  detecting  the  correct  AC  for  any  given  number  of  available  looks.  Stone 
[11]  showed  this  result  for  the  case  when  the  specificity  of  the  sensor  is  perfect,  that  is, 
$q_i = 0$, $i = 1, \ldots, n$. We extend this result to the case where $q_i > 0$.

Theorem 4.2 If $c_i = 1$ and $C_i = 0$ for $i = 1, \ldots, n$, then the greedy rule in Theorem 3.1 is
uniformly optimal.

Proof: Let $\pi = (\pi_1, \pi_2, \ldots, \pi_n)$ denote the prior p.m.f. of the target's location, where
$\pi_i = P(\theta = i)$. Without loss of generality, suppose

$$
\pi_1 p_1 = \max_j \pi_j p_j.
\tag{7}
$$

First  note  that  the  theorem  is  trivially  true  if  the  searcher  is  allowed  only  one  look. 

Suppose there are $k \ge 2$ looks available and consider two searchers A and B with the
same prior p.m.f. $\pi = (\pi_1, \ldots, \pi_n)$. Recall that A's policy can be represented by a sequence
of actions $a = (a_1, a_2, \ldots, a_k)$, where $a_l(\cdot)$ maps the updated p.m.f. of the target's location
to the AC for A's $l$th look. Suppose A does not start in AC 1; that is, $a_1(\pi) \ne 1$. To prove
the theorem, we will show that B can do at least as well as A by first searching in AC 1.
The theorem then follows due to Equation (7).

To do so, consider the following policy for B: First search in AC 1. If the search results
in a true detection, then the search ends; otherwise, instead of updating the p.m.f. of the
target's location, let B keep the original prior p.m.f. $\pi$. Starting from the second look,
continue the search with $a_1, a_2, \ldots$, and update the p.m.f. of the target's location according
to Equations (1) and (2) along the way. The search under $a_1, a_2, \ldots$ continues until B is
instructed by $a_1, a_2, \ldots$ to search in AC 1 for the first time (besides the very first look in AC
1). At that point, do not search in AC 1; instead, update the p.m.f. of the target's location
according to the outcome from the very first search in AC 1. Say $a_m$ instructs B to search
in AC 1 for the first time, then starting in the $(m + 1)$st look let B follow $a_{m+1}, a_{m+2}, \ldots, a_k$
throughout the rest of the search.

In order to show that the probability of B finding the target in $k$ looks is no less than that
of A, we couple the location of the target $\theta$ and the search outcomes for the two searchers,
such that A's $l$th look in AC $i$ yields the same outcome as B's $l$th look in AC $i$, for $l = 1, 2, \ldots$,
and $i = 1, \ldots, n$. By doing so, we can see that with probability $\pi_1 p_1$, B finds the target in
his first look. If B continues after the first look, then in each sample path, A's $l$th look and
its outcome will be identical to B's $(l + 1)$st look and its outcome, $l = 1, 2, \ldots$, as long as
A has not yet searched in AC 1. When A searches in AC 1 for the first time, either A finds
the target (in which case B finds the target in his very first search because of stochastic
coupling), or thereafter both searchers will look at the same ACs throughout the search
process.

To  compare  the  probability  that  each  searcher  can  find  the  target  within  k  looks,  we 
consider  two  cases: 

1.  Searcher  A  searches  in  AC  1  on  or  before  his  kth  look:  In  this  case,  B  finds 
the  target  within  k  looks  if  and  only  if  A  does,  so  the  probability  of  finding  the  target 
within  k  looks  is  identical  for  both  searchers. 

2. Searcher A never searches in AC 1 during his first $k$ looks: In this case, with
probability $\pi_1 p_1$ B finds the target, and A does not. The only situation where A finds
the target within $k$ looks and B does not is if A finds it on his $k$th look in, say, AC
$i$. Because AC $i$ may have been searched a few times during A's first $k - 1$ looks, the
probability that A finds the target on the $k$th look is bounded by $\pi_i p_i$. Therefore, the
probability that A finds the target, but not B, is bounded by

$$
\max_{i = 2, \ldots, n} \pi_i p_i \le \pi_1 p_1,
$$

where the inequality follows from Equation (7). Therefore, the probability of B finding
the target within $k$ looks is at least as large as that of A.

The preceding discussion shows that for $k \ge 1$, there exists a feasible policy that starts
with AC 1 and maximizes the probability that the target will be found within $k$ looks. Hence,
the greedy rule is uniformly optimal. □
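Theorem 4.2 can be spot-checked on a tiny instance by comparing the greedy rule against an exhaustive search over all adaptive policies. The sketch below is only a sanity check under the setting $c_i = 1$, $C_i = 0$, where the greedy index reduces to $\pi_i p_i$; the function names, the 0-based AC indexing, and the two-AC instance are assumptions made for the example.

```python
def look_outcomes(pi, p, q, a):
    """Return P(true detection) and the (probability, posterior) pairs of the other outcomes."""
    n = len(pi)
    hit = pi[a] * p[a]
    branches = []
    miss = 1.0 - p[a] * pi[a] - q[a] * (1.0 - pi[a])  # no detection
    if miss > 1e-15:
        post = [((1.0 - p[a]) if j == a else (1.0 - q[a])) * pi[j] / miss for j in range(n)]
        branches.append((miss, post))                  # Equation (1)
    false = q[a] * (1.0 - pi[a])                       # false detection
    if false > 1e-15:
        post = [0.0 if j == a else pi[j] / (1.0 - pi[a]) for j in range(n)]
        branches.append((false, post))                 # Equation (2)
    return hit, branches


def detection_probability(pi, p, q, k, greedy):
    """P(correct AC detected within k looks) under the greedy rule or the best policy."""
    if k == 0:
        return 0.0
    n = len(pi)
    if greedy:
        actions = [max(range(n), key=lambda i: pi[i] * p[i])]
    else:
        actions = range(n)  # brute force: try every AC at every stage
    best = 0.0
    for a in actions:
        hit, branches = look_outcomes(pi, p, q, a)
        total = hit + sum(
            prob * detection_probability(post, p, q, k - 1, greedy)
            for prob, post in branches
        )
        best = max(best, total)
    return best


# Hypothetical two-AC instance with three available looks.
pi = [0.5, 0.5]
p = [0.9, 0.8]
q = [0.5, 0.0]
greedy_value = detection_probability(pi, p, q, k=3, greedy=True)
optimal_value = detection_probability(pi, p, q, k=3, greedy=False)
```

On this instance both values coincide, as the theorem predicts. The exhaustive search grows exponentially in the number of looks, so it is practical only for very small cases.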


5  Summary  and  Conclusions 

In  this  note  we  extend  previous  results  concerning  discrete  searches  to  the  case  where  the 
searcher  has  imperfect  specificity.  In  that  case,  the  imperfect  searcher  is  coupled  with  a 
perfect,  time-consuming,  investigating  agent  that  verifies  detection  cues  by  the  searcher.  A 
simple  greedy  rule  is  developed,  which  is  proven  to  be  optimal  when  the  objective  is  to 
minimize  the  expected  time  to  detection.  The  expected  search  time  of  this  greedy  search 
is  calculated  and  some  numerical  analysis  is  provided.  For  certain  situations,  it  is  shown 
that  the  greedy  rule  maximizes  a  probability  objective  and  is  uniformly  optimal.  Note, 
however,  that  we  assume,  as  in  previous  works  on  this  topic,  that  the  time  of  a  transition 
from  one  AC  to  another  is  zero.  In  many  situations  (e.g.,  unmanned  aerial  vehicle  searching 
a  road  for  IEDs)  this  may  not  be  the  case  and  travel  time  must  be  accounted  for  explicitly. 
Incorporating  travel  time  in  this  search  model  leads  to  more  complex  dynamic  programming 
settings  that  will  be  explored  in  future  research. 